The relation between the girth and the guaranteed error correction capability of $\gamma$-left regular LDPC codes
when decoded using the bit flipping (serial and parallel) algorithms is investigated. A lower bound on the size of
variable node sets which expand by a factor of at least $3\gamma/4$ is found based on the Moore bound. An upper bound
on the guaranteed error correction capability is established by studying the sizes of smallest possible trapping sets.
The results are extended to generalized LDPC codes. It is shown that generalized LDPC codes can correct a linear
fraction of errors under the parallel bit flipping algorithm when the underlying Tanner graph is a good expander.
It is also shown that the bound cannot be improved when $\gamma$ is even by studying a class of trapping sets. A lower
bound on the size of variable node sets which have the required expansion is established.

Index Terms

Low-density parity-check codes, bit flipping algorithms, trapping sets, error correction capability

I. Introduction

Iterative algorithms for decoding low-density parity-check (LDPC) codes [1] have been the focus of
research over the past decade and most of their properties are well understood [2], [3]. These algorithms
operate by passing messages along the edges of a graphical representation of the code known as the 
Tanner graph, and are optimal when the underlying graph is a tree. Message passing decoders perform 
remarkably well which can be attributed to their ability to correct errors beyond the traditional bounded 
distance decoding capability. However, in contrast to bounded distance decoders (BDDs), the guaranteed 
error correction capability of iterative decoders is largely unknown. 

The problem of recovering from a fixed number of erasures is solved for iterative decoding on the
binary erasure channel (BEC). If the size of the minimum stopping set in the Tanner graph of a code is 
at least t + 1, then the decoder is guaranteed to recover from any t erasures. Orlitsky et al. [4] studied 
the relation between stopping sets and girth and derived bounds on the smallest stopping set in any d-left 
regular Tanner graph with girth g. 

An analogous result does not exist for decoding on other channels such as the binary symmetric channel 
(BSC) and the additive white Gaussian noise (AWGN) channel. In this paper, we present such a result for 
hard decision decoding algorithms. Gallager [1] proposed two binary message passing algorithms, namely 
Gallager A and Gallager B, for decoding over the BSC. He showed that for column-weight $\gamma \geq 3$ and
$\rho > \gamma$, there exist $(n, \gamma, \rho)$ regular LDPC codes$^1$ for which the bit error probability asymptotically tends
to zero whenever we operate below the threshold. The minimum distance was shown to increase linearly 
with the code length, but correction of a linear fraction of errors was not shown. Zyablov and Pinsker 
[6] analyzed LDPC codes under a simpler decoding algorithm known as the bit flipping algorithm, and 
showed that almost all the codes in the regular ensemble with $\gamma \geq 5$ can correct a constant fraction of
worst case errors. Sipser and Spielman [7] used expander graph arguments to analyze two bit flipping 
algorithms, serial and parallel. Specifically, they showed that these algorithms can correct a fraction of 
errors if the underlying Tanner graph is a good expander. Burshtein and Miller [8] applied expander 
based arguments to show that message passing algorithms can also correct a fixed fraction of worst case 
errors when the degree of each variable node is more than five. Feldman et al. [9] showed that the linear 
programming decoder [10] is also capable of correcting a fraction of errors. Recently, Burshtein [11] 
showed that regular codes with variable nodes of degree four are capable of correcting a linear number of 
errors under the bit flipping algorithm. He also showed tremendous improvement in the fraction of correctable
errors when the variable node degree is at least five. 

Tanner [5] studied a class of codes constructed based on bipartite graphs and short error correcting 
codes. Tanner's work is a generalization of the LDPC codes proposed by Gallager [1] and hence these 
codes are referred to as generalized LDPC (GLDPC) codes. Tanner proposed code construction techniques,
decoding algorithms, and methods for complexity and performance analysis of these codes, and derived bounds
on their rate and minimum distance. Sipser and Spielman [7] analyzed a special case of
GLDPC codes (which they termed as expander codes) using expansion arguments and proposed explicit 
constructions of asymptotically good codes capable of correcting a fraction of errors. Zemor [12] improved 
the fraction of correctable errors under a modified decoding algorithm. Barg and Zemor in [13] analyzed 
the error exponents of expander codes and showed that expander codes achieve capacity over the BSC.
Janwa and Lai [14] studied GLDPC codes in the most general setting by considering unbalanced bipartite
graphs. Miladinovic and Fossorier [15] derived bounds on the guaranteed error correction capability of
GLDPC codes for the special case of failures only decoding.

$^1$Precise definitions will be given in Section II; we follow standard terminology from [1] and [5].

The focus of this paper is to establish lower and upper bounds on the guaranteed error correction 
capability of LDPC codes and GLDPC codes as a function of their column-weight and girth. For the 
case of GLDPC codes, we also find the expansion required to guarantee correction of a fraction of errors 
under the parallel bit flipping algorithm, as a function of the error correction capability of the sub-code. 
Our approach can be summarized as follows: (a) to establish lower bounds, we determine the size of 
variable node sets in a left regular Tanner graph which are guaranteed to have the expansion required by 
bit flipping algorithms, based on the Moore bound [16, p. 180] and (b) to find upper bounds, we study 
the sizes of smallest possible trapping sets [17] in a left regular Tanner graph. 

It is well known that a random graph is a good expander with high probability [7]. However, the fraction 
of nodes having the required expansion is very small and hence the code length to guarantee correction of 
a fixed number of errors must be large. Moreover, determining the expansion of a given graph is known 
to be NP hard [18], and spectral gap methods cannot guarantee an expansion factor of more than 1/2 
[7]. On the other hand, code parameters such as column weight and girth can be easily determined or 
are assumed to be known for the code under consideration. We prove that for a given column-weight, 
the error correction capability grows exponentially in girth. However, we note that since the girth grows 
logarithmically in the code length, this result does not show that the bit flipping algorithms can correct a 
linear fraction of errors. 

To find an upper bound on the number of correctable errors, we study the size of sets of variable 
nodes which lead to decoding failures. A decoding failure is said to have occurred if the output of the 
decoder is not equal to the transmitted codeword [17]. The conditions that lead to decoding failures are 
well understood for a variety of decoding algorithms such as maximum likelihood decoding, bounded 
distance decoding and iterative decoding on the BEC. However, for iterative decoding on the BSC and 
AWGN channel, the understanding is far from complete. Two approaches have been taken in this direction, 
namely trapping sets [17] and pseudo-codewords [19]. We adopt the trapping set approach in this paper 
to characterize decoding failures. Richardson [17] introduced the notion of trapping sets to estimate the 
error floor on the AWGN channel. In [20], trapping sets were used to estimate the frame error rate of 
column-weight-three LDPC codes. In this paper, we define trapping sets with the help of fixed points for
the bit flipping algorithms (both serial and parallel). We then find bounds on the size of trapping sets 
based on extremal graphs known as cage graphs [21], thereby finding an upper bound on the guaranteed 
error correction capability. By saying that a code with column weight $\gamma$ and girth $2g'$ is not guaranteed to
correct $k$ errors, we mean that there exists a code with column weight $\gamma$ and girth $2g'$ that fails to correct
$k$ errors.

The rest of the paper is organized as follows. In Section II, we provide a brief introduction to LDPC 
codes, decoding algorithms and trapping sets [17]. In Section III, we prove our main theorem relating the 
column weight and girth to the size of variable node sets which expand by a factor of at least $3\gamma/4$. We
derive bounds on the size of trapping sets based on cage graphs in Section IV. In Section V, we prove 
that the parallel bit flipping algorithm can correct a fraction of errors if the underlying Tanner graph is a 
good expander. We conclude with a few remarks in Section VI. 

II. Preliminaries 

In this section, we first establish the notation and then proceed to give a brief introduction to LDPC 
codes and hard decision decoding algorithms. We then give the relation between the error correction 
capability of the code and the expansion of the underlying Tanner graph. We finally describe trapping 
sets for the algorithms. 

A. Graph Theory Notation 

We adopt the standard notation in graph theory (see [22] for example). $G = (U, E)$ denotes a graph
with set of nodes $U$ and set of edges $E$. When there is no ambiguity, we simply denote the graph by $G$.
An edge $e$ is an unordered pair $(u_1, u_2)$ of nodes and is said to be incident on $u_1$ and $u_2$. Two nodes $u_1$
and $u_2$ are said to be adjacent (neighbors) if there is an edge $e = (u_1, u_2)$ incident on them. The order of
the graph is $|U|$ and the size of the graph is $|E|$. The degree of $u$, $d(u)$, is the number of its neighbors.
A node with degree one is called a leaf or a pendant node. A graph is $d$-regular if all the nodes have
degree $d$. The average degree of a graph is defined as $\bar{d} = 2|E|/|U|$. The girth $g(G)$ of a graph $G$ is
the length of the smallest cycle in $G$. $H = (V \cup C, E')$ denotes a bipartite graph with two sets of nodes:
variable (left) nodes $V$ and check (right) nodes $C$, and edge set $E'$. Nodes in $V$ have neighbors only in $C$
and vice versa. A bipartite graph is said to be $\gamma$-left regular if all variable nodes have degree $\gamma$, $\rho$-right
regular if all check nodes have degree $\rho$, and $(\gamma, \rho)$ regular if all variable nodes have degree $\gamma$ and all
check nodes have degree $\rho$. The girth of a bipartite graph is even.

B. LDPC Codes and Decoding Algorithms 

LDPC codes [1] are a class of linear block codes which can be defined by sparse bipartite graphs [23]. 
Let G be a bipartite graph with two sets of nodes: n variable nodes and m check nodes. This graph defines 
a linear block code $\mathcal{C}$ of length $n$ and dimension at least $n - m$ in the following way: The $n$ variable nodes
are associated to the $n$ coordinates of codewords. A vector $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ is a codeword if and only
if for each check node, the modulo two sum of its neighbors is zero. Such a graphical representation of
an LDPC code is called the Tanner graph [5] of the code. The adjacency matrix of $G$ gives a parity check
matrix of $\mathcal{C}$. An $(n, \gamma, \rho)$ regular LDPC code has a Tanner graph with $n$ variable nodes each of degree $\gamma$
(column weight) and $n\gamma/\rho$ check nodes each of degree $\rho$ (row weight). This code has length $n$ and rate
$r \geq 1 - \gamma/\rho$ [23].

We now describe a simple hard decision decoding algorithm known as the parallel bit flipping algorithm 
[6], [7] to decode LDPC codes. As noted earlier, each check node imposes a constraint on the neighboring 
variable nodes. A constraint (check node) is said to be satisfied by a setting of variable nodes if the sum 
of the variable nodes in the constraint is even; otherwise the constraint is unsatisfied. 

Parallel Bit Flipping Algorithm 

• In parallel, flip each variable that is in more unsatisfied than satisfied constraints. 

• Repeat until no such variable remains. 

A serial version of the algorithm is also defined in [7] and all the results in this paper hold for the serial 
bit flipping algorithm also. The bit flipping algorithms are iterative in nature but do not belong to the 
class of message passing algorithms (see [8] for an explanation). 
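
The two steps of the parallel bit flipping algorithm can be sketched in Python as follows. This is a minimal illustration, not the formulation of [6] or [7]: the function name `parallel_bit_flip`, the representation of the Tanner graph as a list of checks, and the `max_rounds` cap are assumptions made for this example.

```python
def parallel_bit_flip(checks, word, max_rounds=100):
    """Parallel bit flipping on a Tanner graph (illustrative sketch).

    checks: list of checks, each a list of variable indices (assumed format).
    word:   received hard-decision word as a list of 0/1 bits.
    """
    bits = list(word)
    n = len(bits)
    # var_checks[v] lists the checks in which variable v participates.
    var_checks = [[] for _ in range(n)]
    for c, nbrs in enumerate(checks):
        for v in nbrs:
            var_checks[v].append(c)
    for _ in range(max_rounds):
        # A check is unsatisfied when the sum of its neighbors is odd.
        unsat = [sum(bits[v] for v in nbrs) % 2 == 1 for nbrs in checks]
        # Flip, in parallel, every variable in more unsatisfied than satisfied checks.
        flips = [v for v in range(n)
                 if 2 * sum(unsat[c] for c in var_checks[v]) > len(var_checks[v])]
        if not flips:
            break
        for v in flips:
            bits[v] ^= 1
    return bits


# Toy code (an assumption for illustration): variables are the four vertices
# of K4 and checks are its six edges, so every variable has degree three.
checks = [[0, 1], [0, 2], [0, 3], [1, 2], [1, 3], [2, 3]]
print(parallel_bit_flip(checks, [1, 0, 0, 0]))  # → [0, 0, 0, 0]
```

In this toy code a single error is corrected in one round, because the corrupt variable sees three unsatisfied checks while every other variable sees only one.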

C. Expansion and Error Correction Capability 

Sipser and Spielman [7] analyzed the performance of the bit flipping algorithms using the expansion 
properties of the underlying Tanner graph of the code. We summarize the results from [7] below for the 
sake of completeness. We start with the following definitions from [7]. 

Definition 1: Let $G = (U, E)$ with $|U| = n_1$. Then every set of at most $m_1$ nodes expands by a factor
of $\delta$ if, for all sets $S \subset U$,

$$|S| \leq m_1 \Rightarrow |\{y : \exists x \in S \text{ such that } (x, y) \in E\}| > \delta |S|.$$

We consider bipartite graphs and expansion of variable nodes only.

Definition 2: A graph is a $(\gamma, \rho, \alpha, \delta)$ expander if it is a $(\gamma, \rho)$ regular bipartite graph in which every
subset of at most an $\alpha$ fraction of the variable nodes expands by a factor of at least $\delta$.

The following theorem from [7] relates the expansion and error correction capability of an $(n, \gamma, \rho)$ LDPC
code with Tanner graph $G$ when decoded using the parallel bit flipping decoding algorithm.

Theorem 1: [7, Theorem 11] Let $G$ be a $(\gamma, \rho, \alpha, (3/4 + \epsilon)\gamma)$ expander over $n$ variable nodes, for any
$\epsilon > 0$. Then, the simple parallel decoding algorithm will correct any $\alpha_0 < \alpha(1 + 4\epsilon)/2$ fraction of errors
after $\log_{1-4\epsilon}(\alpha_0 n)$ decoding rounds.
Notes:

1) The serial bit flipping algorithm can also correct an $\alpha_0 < \alpha/2$ fraction of errors if $G$ is a
$(\gamma, \rho, \alpha, (3/4)\gamma)$ expander.

2) The results hold for any left regular code as expansion is needed for variable nodes only. 

From the above discussion, it is observed that finding the number of variable nodes which are guaranteed
to expand by a factor of at least $3\gamma/4$ gives a lower bound on the guaranteed error correction capability
of LDPC codes.
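
For very small graphs, the expansion of variable node sets can be checked by exhaustive search. The sketch below is an illustration only: the name `min_expansion` and the adjacency-list input are assumptions, and the search is exponential in the set size, which is consistent with the remark in Section I that determining expansion is hard in general.

```python
from itertools import combinations


def min_expansion(var_nbrs, max_size):
    """Smallest ratio |N(S)|/|S| over nonempty variable sets S with |S| <= max_size.

    var_nbrs[v] is the list of checks adjacent to variable v (assumed format).
    """
    worst = float("inf")
    for size in range(1, max_size + 1):
        for subset in combinations(range(len(var_nbrs)), size):
            nbrs = set()
            for v in subset:
                nbrs.update(var_nbrs[v])
            worst = min(worst, len(nbrs) / size)
    return worst


# Toy graph (assumed): variables are the vertices of K4, checks are its edges,
# so gamma = 3 and the target expansion factor is 3 * gamma / 4 = 2.25.
var_nbrs = [[0, 1, 2], [0, 3, 4], [1, 3, 5], [2, 4, 5]]
print(min_expansion(var_nbrs, 2))  # → 2.5
print(min_expansion(var_nbrs, 3))  # → 2.0
```

In this toy graph, sets of at most two variable nodes expand by more than 2.25, while some set of three does not.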

D. Decoding Failures and Trapping Sets 

We now characterize failures of the iterative decoders using fixed points and trapping sets. Some of the 
following discussion appears in [24], [20], [25] and we include it for sake of completeness. 

Consider an LDPC code of length $n$ and let $\mathbf{x} = (x_1 x_2 \ldots x_n)$ be the binary vector which is the input to
the iterative decoder. Let $S(\mathbf{x})$ be the support of $\mathbf{x}$. The support of $\mathbf{x}$ is defined as the set of all positions
$i$ where $x_i \neq 0$. The set of variable nodes (bits) which differ from their correct value are referred to as
corrupt variables.

Definition 3: [24] A decoder failure is said to have occurred if the output of the decoder is not equal 
to the transmitted codeword. 

Definition 4: x is a fixed point of the bit flipping algorithm if the set of corrupt variables remains 
unchanged after one round of decoding. 

Definition 5: [20] The support of a fixed point is known as a trapping set. A (V, C) trapping set T is 
a set of V variable nodes whose induced subgraph has C odd degree checks. 

If the variable nodes corresponding to a trapping set are in error, then a decoder failure occurs. However, 
not all variable nodes corresponding to a trapping set need to be in error for a decoder failure to occur. 

Definition 6: [20] The minimal number of variable nodes that have to be initially in error for the 
decoder to end up in the trapping set T will be referred to as critical number m for that trapping set. 

Definition 7: [24] A set of variable nodes which if in error lead to a decoding failure is known as a 
failure set. 
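
Definitions 4 and 5 can be illustrated with a short Python check of whether a given error pattern is a fixed point of the bit flipping rule. This is a sketch under stated assumptions: the all-zero codeword is taken as transmitted (so the received word equals the error pattern), and the name `is_fixed_point` and the toy graph, which is not left regular, are hypothetical.

```python
def is_fixed_point(checks, n, support):
    """True if no variable flips in one bit flipping round when exactly the
    variables in `support` are corrupt (all-zero codeword assumed)."""
    in_error = [v in support for v in range(n)]
    unsat_count = [0] * n
    degree = [0] * n
    for nbrs in checks:
        # A check is unsatisfied when it sees an odd number of corrupt variables.
        unsat = sum(in_error[v] for v in nbrs) % 2 == 1
        for v in nbrs:
            degree[v] += 1
            unsat_count[v] += unsat
    # A variable flips only if it is in more unsatisfied than satisfied checks.
    return all(2 * unsat_count[v] <= degree[v] for v in range(n))


# Toy graph (assumed): variables 0, 1, 2 form a six-cycle through the first
# three checks; each also has two further checks shared with outside variables.
checks = [[0, 1], [1, 2], [2, 0],
          [0, 3], [0, 4], [1, 5], [1, 6], [2, 7], [2, 8],
          [3, 4, 5, 6, 7, 8]]
print(is_fixed_point(checks, 9, {0, 1, 2}))  # → True
print(is_fixed_point(checks, 9, {0}))        # → False
```

Here the set {0, 1, 2} induces a subgraph with six odd degree checks, so in the notation of Definition 5 it is a (3, 6) trapping set of the toy graph.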

III. Column Weight, Girth and Expansion 

In this section, we prove our main theorem which relates the column weight and girth of a code to its 
error correction capability. We show that the size of variable node sets which have the required expansion 
is related to the well known Moore bound [16, p. 180]. We start with a few definitions required to establish 
the main theorem. 



SUBMITTED TO IEEE TRANSACTIONS ON INFORMATION THEORY, MAY 2008 



A. Definitions 

Definition 8: The reduced graph $H_r = (V \cup C_r, E'_r)$ of $H = (V \cup C, E')$ is a graph with vertex set
$V \cup C_r$ and edge set $E'_r$ given by

$$C_r = C \setminus C_p, \quad C_p = \{c \in C : c \text{ is a pendant node}\},$$
$$E'_r = E' \setminus E'_p, \quad E'_p = \{(v_i, c_j) \in E' : c_j \in C_p\}.$$

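Definition 9: Let $H = (V \cup C, E')$ be such that $\forall v \in V$, $d(v) \leq \gamma$. The $\gamma$ augmented graph
$H_\gamma = (V \cup C_\gamma, E'_\gamma)$ is a graph with vertex set $V \cup C_\gamma$ and edge set $E'_\gamma$ given by

$$C_\gamma = C \cup C_a, \quad \text{where } C_a = \bigcup_{i=1}^{|V|} C_a^i \text{ and } C_a^i = \{c_1^i, \ldots, c_{\gamma - d(v_i)}^i\},$$
$$E'_\gamma = E' \cup E'_a, \quad \text{where } E'_a = \bigcup_{i=1}^{|V|} E'^{\,i}_a \text{ and } E'^{\,i}_a = \{(v_i, c_j) \in V \times C_a : c_j \in C_a^i\}.$$
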
Definition 10: [7, Definition 4] The edge-vertex incidence graph $G_{ev} = (U \cup E, E_{ev})$ of $G = (U, E)$
is the bipartite graph with vertex set $U \cup E$ and edge set

$$E_{ev} = \{(e, u) \in E \times U : u \text{ is an endpoint of } e\}.$$

Notes:

1) The edge-vertex incidence graph is right regular with degree two.

2) $|E_{ev}| = 2|E|$.

3) $g(G_{ev}) = 2g(G)$.
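
The notes on Definition 10 can be checked numerically on small graphs. The sketch below is an illustration under assumptions: graphs are stored as adjacency dictionaries with comparable vertex labels, and the helper names `girth` and `edge_vertex_incidence` are hypothetical. The girth is computed with a breadth-first search from every node.

```python
from collections import deque


def girth(adj):
    """Length of the shortest cycle of a simple undirected graph (inf if acyclic)."""
    best = float("inf")
    for root in adj:
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:
                    # Non-tree edge: it closes a cycle through the root's BFS tree.
                    best = min(best, dist[u] + dist[w] + 1)
    return best


def edge_vertex_incidence(adj):
    """Bipartite graph whose left nodes are the vertices of G and whose right
    nodes are the edges of G; each edge node is joined to its two endpoints."""
    ev = {("v", u): [] for u in adj}
    for u in adj:
        for w in adj[u]:
            if u < w:  # visit each undirected edge once
                e = ("e", (u, w))
                ev[e] = [("v", u), ("v", w)]
                ev[("v", u)].append(e)
                ev[("v", w)].append(e)
    return ev


# Five-cycle: girth 5, so its edge-vertex incidence graph has girth 10.
cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(girth(cycle5), girth(edge_vertex_incidence(cycle5)))  # → 5 10
```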

Definition 11: An inverse edge-vertex incidence graph $H_{iev} = (V, E'_{iev})$ of $H = (V \cup C, E')$ is a graph
with vertex set $V$ and edge set $E'_{iev}$ which is obtained as follows. For $c \in C_r$, let $N(c)$ denote the set of
neighbors of $c$. Label one node $v_i \in N(c)$ as a root node. Then

$$E'_{iev} = \{(v_i, v_j) \in V \times V : v_i \in N(c),\ v_j \in N(c),\ i \neq j,\ v_i \text{ is a root node, for some } c \in C_r\}.$$

Notes:

1) Given a graph, the inverse edge-vertex incidence graph is not unique.

2) $g(H_{iev}) \geq g(H)/2$, $|E'_{iev}| = |E'_r| - |C_r|$ and $|C_r| \leq |E'_r|/2$.

3) $|E'_{iev}| \geq |E'_r|/2$ with equality only if all checks in $C_r$ have degree two.

4) The term inverse edge-vertex incidence is used for the following reason. Suppose all checks in $H$
have degree two. Then the edge-vertex incidence graph of $H_{iev}$ is $H$.
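
The construction in Definition 11 can be sketched as follows. The function name `inverse_edge_vertex_incidence` and the choice of the first listed neighbor as the root are assumptions for illustration; as noted above, any other choice of roots gives an equally valid graph.

```python
def inverse_edge_vertex_incidence(checks):
    """Edge list of one inverse edge-vertex incidence graph.

    checks: list of checks, each a list of variable indices (assumed format).
    """
    edges = []
    for nbrs in checks:
        if len(nbrs) < 2:
            continue  # pendant checks are not part of the reduced graph
        root = nbrs[0]  # arbitrary root; the resulting graph is not unique
        # Join the root to every other neighbor of this check.
        edges.extend((root, v) for v in nbrs[1:])
    return edges


# Two non-pendant checks with 5 edges in total, plus one pendant check.
checks = [[0, 1, 2], [2, 3], [4]]
print(inverse_edge_vertex_incidence(checks))  # → [(0, 1), (0, 2), (2, 3)]
```

In this example the reduced graph has 5 edges and 2 checks, and the result has 5 - 2 = 3 edges, in agreement with Note 2.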

The Moore bound [16, p. 180], denoted by $n_0(d, g)$, is a lower bound on the least number of vertices in
a $d$-regular graph with girth $g$. It is given by

$$n_0(d, g) = n_0(d, 2r + 1) = 1 + d \sum_{i=0}^{r-1} (d - 1)^i, \quad g \text{ odd},$$
$$n_0(d, g) = n_0(d, 2r) = 2 \sum_{i=0}^{r-1} (d - 1)^i, \quad g \text{ even}.$$

In [26], it was shown that a similar bound holds for irregular graphs. 

Theorem 2: [26] The number of nodes $n(d, g)$ in a graph of girth $g$ and average degree at least $d \geq 2$
satisfies

$$n(d, g) \geq n_0(d, g).$$

Note that $d$ need not be an integer in the above theorem.

B. The Main Theorem 

We now state and prove the main theorem. 

Theorem 3: Let $G$ be a $\gamma$-left regular Tanner graph with $\gamma \geq 4$ and $g(G) = 2g'$. Then for all $k < n_0(\gamma/2, g')$,
any set of $k$ variable nodes in $G$ expands by a factor of at least $3\gamma/4$.

Proof: Let $G^k = (V^k \cup C^k, E^k)$ denote the subgraph induced by a set of $k$ variable nodes $V^k$. Since
$G$ is $\gamma$-left regular, $|E^k| = \gamma k$. Let $G_r^k = (V^k \cup C_r^k, E_r^k)$ be the reduced graph. We have

$$|C^k| = |C_r^k| + |C_p^k|,$$
$$|E^k| = |E_r^k| + |E_p^k|,$$
$$|C_p^k| = \gamma k - |E_r^k|.$$

We need to prove that $|C^k| \geq 3\gamma k/4$.

Let $f(k, g')$ denote the maximum number of edges in an arbitrary graph of order $k$ and girth $g'$. By
Theorem 2, for all $k < n_0(\gamma/2, g')$, the average degree of a graph with $k$ nodes and girth $g'$ is less than
$\gamma/2$. Hence, $f(k, g') < \gamma k/4$. We now have the following lemma.

Lemma 1: The number of edges in $G_r^k$ cannot exceed $2f(k, g')$, i.e.,

$$|E_r^k| \leq 2f(k, g').$$

Proof: The proof is by contradiction. Assume that $|E_r^k| > 2f(k, g')$. Consider $G_{iev}^k = (V^k, E_{iev}^k)$, an
inverse edge-vertex incidence graph of $G^k$. We have

$$|E_{iev}^k| > f(k, g').$$

This is a contradiction as $G_{iev}^k$ is a graph of order $k$ and girth at least $g'$. ■
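We now find a lower bound on $|C^k|$ in terms of $f(k, g')$. We have the following lemma.

Lemma 2: $|C^k| \geq \gamma k - f(k, g')$.

Proof: Let $|E_r^k| = 2f(k, g') - j$ for some integer $j \geq 0$. Then $|E_p^k| = \gamma k - 2f(k, g') + j$. We claim
that $|C_r^k| \geq f(k, g') - j$. To see this, we note that

$$|E_{iev}^k| = |E_r^k| - |C_r^k|, \quad \text{or} \quad |C_r^k| = |E_r^k| - |E_{iev}^k|.$$

But

$$|E_{iev}^k| \leq f(k, g').$$

Hence we have

$$|C_r^k| \geq 2f(k, g') - j - f(k, g') \;\Rightarrow\; |C_r^k| \geq f(k, g') - j.$$

Since $|C_p^k| = |E_p^k|$, it follows that

$$|C^k| = |C_r^k| + |C_p^k| \;\Rightarrow\; |C^k| \geq f(k, g') - j + \gamma k - 2f(k, g') + j \;\Rightarrow\; |C^k| \geq \gamma k - f(k, g').$$ ■

The theorem now follows as

$$f(k, g') < \gamma k/4$$

and therefore

$$|C^k| > 3\gamma k/4.$$ ■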

Corollary 1: Let $\mathcal{C}$ be an LDPC code with column-weight $\gamma \geq 4$ and girth $2g'$. Then the bit flipping
algorithm can correct any error pattern of weight less than $n_0(\gamma/2, g')/2$.
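
To see how the bound of Corollary 1 grows with the girth, the Moore bound and the resulting number of guaranteed correctable errors can be evaluated directly. The sketch below is an illustration: the names `moore_bound` and `guaranteed_correctable` are assumptions, and exact rational arithmetic is used because $\gamma/2$ need not be an integer.

```python
from fractions import Fraction
from math import ceil


def moore_bound(d, g):
    """Moore bound n_0(d, g); d may be a non-integer average degree."""
    r = g // 2  # g = 2r + 1 for odd g, g = 2r for even g
    geometric = sum((d - 1) ** i for i in range(r))
    return 1 + d * geometric if g % 2 == 1 else 2 * geometric


def guaranteed_correctable(gamma, girth):
    """Largest integer weight strictly below n_0(gamma/2, g')/2, where girth = 2g'."""
    bound = moore_bound(Fraction(gamma, 2), girth // 2)
    return ceil(bound / 2) - 1


print(moore_bound(3, 5), moore_bound(3, 6))  # → 10 14
for girth in (8, 12, 16):
    print(girth, guaranteed_correctable(6, girth))
# → 8 2
# → 12 6
# → 16 14
```

For column weight six the guaranteed number of correctable errors roughly doubles each time the girth increases by four, which illustrates the exponential dependence on girth stated in Section I.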

IV. Cage Graphs and Trapping Sets 

In this section, we first give necessary and sufficient conditions for a given set of variables to be a 
trapping set. We then proceed to define a class of interesting graphs known as cage graphs [21] and 
establish a relation between cage graphs and trapping sets. We then give an upper bound on the error 
correction capability based on the sizes of cage graphs. The proofs in this section are along the same 
lines as in Section III. Hence, we only give a sketch of the proofs. 

Theorem 4: Let $\mathcal{C}$ be an LDPC code with $\gamma$-left regular Tanner graph $G$. Let $\mathcal{T}$ be a set consisting of
$V$ variable nodes with induced subgraph $\mathcal{I}$. Let the checks in $\mathcal{I}$ be partitioned into two disjoint subsets: $\mathcal{O}$,
consisting of checks with odd degree, and $\mathcal{E}$, consisting of checks with even degree. Then $\mathcal{T}$ is a trapping
set for the bit flipping algorithm iff: (a) every variable node in $\mathcal{I}$ has at least $\lceil \gamma/2 \rceil$ neighbors in $\mathcal{E}$, and (b)
no $\lfloor \gamma/2 \rfloor + 1$ checks of $\mathcal{O}$ share a neighbor outside $\mathcal{I}$.

Proof: We first show that the conditions stated are sufficient. Let $\mathbf{x}_{\mathcal{T}}$ be the input to the bit flipping
algorithm, with support $\mathcal{T}$. The only unsatisfied constraints are in $\mathcal{O}$. By the conditions of the theorem,
we observe that no variable node is involved in more unsatisfied constraints than satisfied constraints.
Hence, no variable node is flipped and by definition $\mathbf{x}_{\mathcal{T}}$ is a fixed point, implying that $\mathcal{T}$ is a trapping set.

To see that the conditions are necessary, observe that for $\mathcal{T}$ to be a trapping set, no variable node
should be involved in more unsatisfied constraints than satisfied constraints. ■

Remark: Theorem 4 is a consequence of Fact 3 from [17]. 

To determine whether a given set of variables is a trapping set, it is necessary to not only know the 
induced subgraph but also the neighbors of the odd degree checks. However, in order to establish general 
bounds on the sizes of trapping sets given only the column weight and the girth, we consider only condition 
(a) of Theorem 4 which is a necessary condition. A set of variable nodes satisfying condition (a) is known 
as a potential trapping set. A trapping set is a potential trapping set that satisfies condition (b). Hence, 
a lower bound on the size of the potential trapping set is a lower bound on the size of a trapping set. 
It is worth noting that a potential trapping set can always be extended to a trapping set by successively 
adding a variable node till condition (b) is satisfied. 
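
Condition (a) of Theorem 4 depends only on the induced subgraph, so it can be tested without knowledge of the rest of the code. The sketch below is an illustration; the name `is_potential_trapping_set` and the dictionary representation of the graph are assumptions.

```python
def is_potential_trapping_set(var_nbrs, gamma, subset):
    """Check condition (a) of Theorem 4 for the variable nodes in `subset`.

    var_nbrs[v] lists the checks adjacent to variable v (assumed format);
    every variable in `subset` is assumed to have degree gamma.
    """
    # Degree of each check in the subgraph induced by `subset`.
    count = {}
    for v in subset:
        for c in var_nbrs[v]:
            count[c] = count.get(c, 0) + 1
    need = (gamma + 1) // 2  # ceil(gamma / 2)
    # Every variable needs at least ceil(gamma / 2) even degree neighbors.
    return all(sum(count[c] % 2 == 0 for c in var_nbrs[v]) >= need
               for v in subset)


# Three variables of degree four on a six-cycle (checks 0, 1, 2), each with
# two further checks of degree one in the induced subgraph.
var_nbrs = {0: [0, 2, 3, 4], 1: [0, 1, 5, 6], 2: [1, 2, 7, 8]}
print(is_potential_trapping_set(var_nbrs, 4, [0, 1, 2]))  # → True
print(is_potential_trapping_set(var_nbrs, 4, [0, 1]))     # → False
```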

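Definition 12: [21] A $(d, g)$-cage graph, $G(d, g)$, is a $d$-regular graph with girth $g$ having the minimum
possible number of nodes.

A lower bound, $n_l(d, g)$, on the number of nodes $n_c(d, g)$ in a $(d, g)$-cage graph is given by the Moore
bound. An upper bound $n_u(d, g)$ on $n_c(d, g)$ (see [21] and references therein) is given by

$$n_u(3, g) = \begin{cases} \frac{4}{3} + \frac{29}{12}\, 2^{g-2} & \text{for } g \text{ odd}, \\ \frac{2}{3} + \frac{29}{12}\, 2^{g-2} & \text{for } g \text{ even}, \end{cases}$$
$$n_u(d, g) = \begin{cases} 2(d - 1)^{g-2} & \text{for } g \text{ odd}, \\ 4(d - 1)^{g-3} & \text{for } g \text{ even}. \end{cases}$$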

Theorem 5: Let $\mathcal{C}$ be an LDPC code with $\gamma$-left regular Tanner graph $G$ and girth $2g'$. Let $\mathcal{T}(\gamma, 2g')$
denote the smallest possible potential trapping set of $\mathcal{C}$ for the bit flipping algorithm. Then,

$$|\mathcal{T}(\gamma, 2g')| = n_c(\lceil \gamma/2 \rceil, g').$$

Proof: We first prove the following lemma and then exhibit a potential trapping set of size $n_c(\lceil \gamma/2 \rceil, g')$.

Lemma 3: $|\mathcal{T}(\gamma, 2g')| \geq n_c(\lceil \gamma/2 \rceil, g')$.

Proof: Let $\mathcal{T}_1$ be a trapping set with $|\mathcal{T}_1| < n_c(\lceil \gamma/2 \rceil, g')$ and let $G_1$ denote the induced subgraph
of $\mathcal{T}_1$. We can construct a $(\lceil \gamma/2 \rceil, g'')$-cage graph ($g'' \geq g'$) with $|\mathcal{T}_1| < n_c(\lceil \gamma/2 \rceil, g')$ nodes by removing
edges (if necessary) from the inverse edge-vertex incidence graph of $G_1$, which is a contradiction. ■

We now exhibit a potential trapping set of size $n_c(\lceil \gamma/2 \rceil, g')$. Let $G_{ev}(\lceil \gamma/2 \rceil, g')$ be the edge-vertex
incidence graph of a $G(\lceil \gamma/2 \rceil, g')$. Note that $G_{ev}(\lceil \gamma/2 \rceil, g')$ is a left regular bipartite graph with $n_c(\lceil \gamma/2 \rceil, g')$
variable nodes of degree $\lceil \gamma/2 \rceil$ and all checks have degree two. Now consider $G_{ev,\gamma}(\lceil \gamma/2 \rceil, g')$, the $\gamma$
augmented graph of $G_{ev}(\lceil \gamma/2 \rceil, g')$. It can be seen that $G_{ev,\gamma}(\lceil \gamma/2 \rceil, g')$ is a potential trapping set. ■
Theorem 6: There exists a code $\mathcal{C}$ with $\gamma$-left regular Tanner graph of girth $2g'$ which fails to correct
$n_c(\lceil \gamma/2 \rceil, g')$ errors.

Proof: Let $G_{ev,\gamma}(\lceil \gamma/2 \rceil, g')$ be as defined in Theorem 5. Now construct a code $\mathcal{C}$ with column-weight
$\gamma$ and girth $2g'$ starting from $G_{ev,\gamma}(\lceil \gamma/2 \rceil, g')$ such that the set of variable nodes in $G_{ev,\gamma}(\lceil \gamma/2 \rceil, g')$ also
satisfies condition (b) of Theorem 4. Then, by Theorem 4 and Theorem 5, the set of variable nodes in
$G_{ev,\gamma}(\lceil \gamma/2 \rceil, g')$ with cardinality $n_c(\lceil \gamma/2 \rceil, g')$ is a trapping set and hence $\mathcal{C}$ fails to decode an error
pattern of weight $n_c(\lceil \gamma/2 \rceil, g')$. ■

Remark: We note that for $\gamma = 3$ and $\gamma = 4$, the above bound is tight. Observe that for $d = 2$, the
Moore bound is $n_0(d, g) = g$ and that a cycle of length $2g$ with $g$ variable nodes is always a potential
trapping set. In fact, for a code with $\gamma = 3$ or 4, and Tanner graph of girth greater than eight, a cycle of
the smallest length is always a trapping set (see [24] for the proof).

V. Generalized LDPC Codes 

In this section, we first consider two bit flipping decoding algorithms for GLDPC codes. We then 
establish a relation between expansion and error correction capability. We also establish a lower bound 
on the number of variable nodes that have the required expansion. We then exhibit a trapping set and as 
a consequence show that the bound on the required expansion cannot be improved when $\gamma$ is even. We
also establish bounds on the size of trapping sets. 

We begin with the definition of GLDPC codes by adopting the terminology from expander codes [7]. 

Definition 13 (Definition 6, [7]): Let $G$ be a $(\gamma, \rho)$ regular bipartite graph between $n$ variable nodes
$(v_1, v_2, \ldots, v_n)$ and $n\gamma/\rho$ check nodes $(c_1, c_2, \ldots, c_{n\gamma/\rho})$. Let $b(i, j)$ be a function designed so that, for
each check node $c_i$, the variables neighboring $c_i$ are $v_{b(i,1)}, \ldots, v_{b(i,\rho)}$. Let $S$ be an error correcting
code of block length $\rho$. The GLDPC code $\mathcal{C}(G, S)$ is the code of block length $n$ whose codewords are
the words $(x_1, x_2, \ldots, x_n)$ such that, for $1 \leq i \leq n\gamma/\rho$, $(x_{b(i,1)}, \ldots, x_{b(i,\rho)})$ is a codeword of $S$.

The terms column-weight, row-weight, check nodes, variable nodes and trapping sets mean the same 
as in case of LDPC codes. The code S at each check node is sometimes referred to as the sub-code. 

A. Decoding Algorithms

Tanner [5] proposed different hard decision decoding algorithms to decode GLDPC codes. We now 
describe an iterative algorithm known as parallel bit flipping algorithm originally described in [5], which 
is employed when the sub-code is capable of correcting t errors. 

Parallel bit flipping algorithm: Each decoding round consists of the following steps. 

• A variable node sends its current estimate to check nodes. 

• A check node performs decoding on incoming messages and finds the nearest codeword. For all 
variable nodes which differ from the codeword, the check node sends a flip message. If the check 
node does not find a unique codeword, it does not send any flip messages. 

• A variable node flips if it receives more than $\gamma/2$ flip messages.
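
The three steps of each decoding round can be sketched in Python as follows. Several details are assumptions made for this illustration rather than part of the description in [5]: the sub-code is given as an explicit list of codewords, "nearest codeword" is implemented as a search within Hamming distance $t$, and the names `gldpc_parallel_bit_flip` and `max_rounds` are hypothetical.

```python
def gldpc_parallel_bit_flip(checks, subcode, t, word, max_rounds=50):
    """Parallel bit flipping for a GLDPC code (illustrative sketch).

    checks:  list of checks, each an ordered list of variable indices.
    subcode: list of codewords (tuples of bits) of the sub-code.
    t:       number of errors the sub-code decoder is allowed to correct.
    """
    bits = list(word)
    n = len(bits)
    degree = [0] * n
    for nbrs in checks:
        for v in nbrs:
            degree[v] += 1
    for _ in range(max_rounds):
        flip_msgs = [0] * n
        for nbrs in checks:
            local = tuple(bits[v] for v in nbrs)
            # Sub-code codewords within Hamming distance t of the local word.
            near = [cw for cw in subcode
                    if sum(a != b for a, b in zip(cw, local)) <= t]
            if len(near) != 1:
                continue  # no unique codeword found: send no flip messages
            for v, a, b in zip(nbrs, near[0], local):
                if a != b:
                    flip_msgs[v] += 1
        # A variable flips if it receives more than half of its degree in flip messages.
        flips = [v for v in range(n) if 2 * flip_msgs[v] > degree[v]]
        if not flips:
            break
        for v in flips:
            bits[v] ^= 1
    return bits


# Toy GLDPC code (assumed): 6 variables of degree 2, 4 checks of degree 3,
# with the length-3 repetition code (t = 1) as the sub-code.
checks = [[0, 1, 2], [0, 3, 4], [1, 3, 5], [2, 4, 5]]
subcode = [(0, 0, 0), (1, 1, 1)]
print(gldpc_parallel_bit_flip(checks, subcode, 1, [1, 0, 0, 0, 0, 0]))
# → [0, 0, 0, 0, 0, 0]
```

With two errors in positions 0 and 1 the same toy decoder stalls, since the first check miscorrects and no variable receives more than one flip message.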

The set of variable nodes which differ from their correct value are known as corrupt variables. The rest 
of the variable nodes are referred to as correct variables. Following the algorithms, we have the following 
definition adopted from [7]: 

Definition 14: A check node is said to be confused if it sends flip messages to correct variable nodes, 
or if it does not send flip message to corrupt variable nodes, or both. Otherwise, a check node is said to 
be helpful. 
Remarks: 

1) For the parallel bit flipping decoding algorithm, a check node with sub-code of minimum distance
at least $2t + 1$ can be confused only if it is connected to more than $t$ corrupt variable nodes.

2) The parallel bit flipping algorithm is different from the algorithm presented by Sipser and Spielman 
in [7] for expander codes, but is similar to the algorithm proposed by Zemor in [12]. However, 
we note that the codes considered in [12] are based on d-regular bipartite graphs and are a special 
case of doubly generalized LDPC codes, where each variable node is also associated with an error 
correcting code. 

3) Apart from helpful checks and confused checks, Sipser and Spielman defined unhelpful checks. 
However, our definition of confused checks includes unhelpful checks as well. 

4) Miladinovic and Fossorier in [15] considered a decoding algorithm where the decoding at every 
check either results in correct decoding or a failure but not miscorrection. While this assumption is 
reasonable when the sub-code is a long code, it is not true in general. We however, point out that 
the methodology we adopt can be applied to this case as well. 

5) The work by Sipser and Spielman [7], Zemor [12], Barg and Zemor [13] and Janwa and Lai [14]
focused on asymptotic results and explicit construction of expander codes. The proofs and constructions
are based on spectral gap and, as noted earlier, such methods cannot guarantee an expansion
factor of more than 1/2. Our proofs require a greater expansion factor.

B. Expansion and Error Correction Capability 

We now prove that the above described algorithm can correct a fraction of errors if the underlying 
Tanner graph is a good expander. 

Theorem 7: Let $\mathcal{C}(G, S)$ be a GLDPC code with a $\gamma$-left regular Tanner graph $G$. Assume that the
sub-code $S$ has minimum distance at least $d_{\min} = 2t + 1$ and is capable of correcting $t$ errors. Let $G$ be
a $(\gamma, \rho, \alpha, \beta\gamma)$ expander where

$$\beta > \frac{t + 2}{2(t + 1)}.$$

Then the parallel bit flipping decoding algorithm will correct any $\alpha_0 < \alpha$ fraction of errors.

Proof: Let $n$ be the number of variable nodes in $\mathcal{C}$. Let $V$ be the set of corrupt variables at the
beginning of a decoding round. Assume that $|V| \leq \alpha n$. We will show that after the decoding round, the
number of corrupt variables is strictly less than $|V|$.

Let $F$ be the set of corrupt variables that fail to flip in one decoding round, and let $C$ be the set of
variables that were originally uncorrupt, but which become corrupt after one decoding round. After one
decoding round, the set of corrupt variables is $F \cup C$. In the worst case scenario, a confused check sends
$t$ flip messages to the uncorrupt variables and no flip message to the corrupt variables. We now have the
following lemma:

Lemma 4: Let $C_k$ be the set of confused checks. Then

$$|C_k| \leq \frac{(1 - \beta)\gamma |V|}{t}. \quad (1)$$

Proof: The total number of edges connected to the corrupt variables is $\gamma |V|$. Each confused check
must have at least $t + 1$ neighbors in $V$. Let $S$ be the set of helpful checks that have at least one neighbor
in $V$. Then,

$$\gamma |V| \geq |C_k|(t + 1) + |S|. \quad (2)$$

By expansion,

$$|S| + |C_k| \geq \beta\gamma |V|. \quad (3)$$

By (2) and (3), we obtain

$$|C_k| \leq \frac{(1 - \beta)\gamma |V|}{t}.$$
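We now prove that $|F \cup C| < |V|$. The proof is by contradiction. Assume that $|F \cup C| \geq |V|$. Then there
exists a subset $C' \subseteq C$ such that $|F \cup C'| = |V|$. We observe that a variable node in $F$ can have at most
$\lfloor \gamma/2 \rfloor$ neighbors that are not in $C_k$. Also, a variable node in $C$ must have at least
$\lfloor \gamma/2 \rfloor + 1$ neighbors in $C_k$, and hence can have at most $\lceil \gamma/2 \rceil - 1$ neighbors
that are not in $C_k$. Let $N(F \cup C')$ be the set of neighbors of $F \cup C'$. Then,

$$|N(F \cup C')| \leq |C_k| + \left\lfloor \frac{\gamma}{2} \right\rfloor |F| + \left( \left\lceil \frac{\gamma}{2} \right\rceil - 1 \right) |C'| \leq |C_k| + \frac{\gamma}{2}|F| + \frac{\gamma}{2}|C'| = |C_k| + \frac{\gamma}{2}|V|. \qquad (4)$$

Substituting (1) into (4), we obtain

$$|N(F \cup C')| \leq \left( \frac{1 - \beta}{t} + \frac{1}{2} \right) \gamma |V|.$$

Now

$$\beta > \frac{t + 2}{2(t + 1)} \;\Rightarrow\; \frac{1 - \beta}{t} < \frac{2\beta - 1}{2} \;\Rightarrow\; |N(F \cup C')| < \beta\gamma |V|,$$

which is a contradiction. ■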
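The last implication in the proof can be checked numerically. The sketch below is an illustration added here, not part of the original derivation; the function names are hypothetical. It uses exact rational arithmetic to confirm that the coefficient $(1 - \beta)/t + 1/2$ falls strictly below $\beta$ only when $\beta$ exceeds $(t + 2)/(2(t + 1))$.

```python
from fractions import Fraction


def beta_threshold(t: int) -> Fraction:
    """Threshold (t + 2) / (2 (t + 1)) on beta from Theorem 7."""
    return Fraction(t + 2, 2 * (t + 1))


def neighbor_bound_coeff(beta: Fraction, t: int) -> Fraction:
    """Coefficient c in |N(F u C')| <= c * gamma * |V|, i.e. (1 - beta)/t + 1/2."""
    return (1 - beta) / t + Fraction(1, 2)


def contradiction_holds(beta: Fraction, t: int) -> bool:
    """True when the upper bound on |N(F u C')| is strictly below beta * gamma * |V|."""
    return neighbor_bound_coeff(beta, t) < beta


if __name__ == "__main__":
    for t in (1, 2, 3):
        thr = beta_threshold(t)
        slightly_above = thr + Fraction(1, 100)
        print(t, thr, contradiction_holds(thr, t), contradiction_holds(slightly_above, t))
```

At the threshold itself the coefficient equals $\beta$, so the contradiction is lost, which is why the theorem requires a strict inequality. For $t = 1$ the threshold is $3/4$, which matches the $3\gamma/4$ expansion factor used for LDPC codes.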
Remark: The above theorem proves that the parallel bit flipping algorithm can correct a fraction of
errors in a number of rounds that is linear in the code length. However, if we assume an expansion of
$(\beta + \delta)\gamma$, it can be shown that the number of errors decreases by a constant factor with every
iteration, resulting in convergence in a logarithmic number of rounds.

The following theorem establishes a lower bound on the number of nodes in a left regular graph which
expand by the factor required by the above algorithms.

Theorem 8: Let $G$ be a $\gamma$-left regular bipartite graph with $g(G) = 2g'$. Then for all
$k < n_0(\gamma t/(t + 1), g')$, any set of $k$ variable nodes in $G$ expands by a factor of at least $\beta\gamma$, where

$$\beta = \frac{t + 2}{2(t + 1)}.$$

Proof: The proof is similar to the proof of Theorem 3. Following the notation from Theorem 3, we
note that for all $k < n_0(\gamma t/(t + 1), g')$,

$$f(k, g') < \frac{k \gamma t}{2(t + 1)}.$$

Since $|C^k| \geq \gamma k - f(k, g')$, we have

$$|C^k| > \frac{(t + 2)\gamma k}{2(t + 1)}.$$
Note that the above theorem holds when $\gamma t/(t + 1) \geq 2$.
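To get a feel for the size of the bound in Theorem 8, the following sketch evaluates $n_0(d, g)$. It assumes that $n_0$ is the standard Moore bound for graphs of average degree $d \geq 2$ and girth $g$; the definition used with Theorem 3 is not restated in this section, so that form is an assumption here, and the function names are hypothetical.

```python
def moore_bound(d: float, g: int) -> float:
    """Moore bound n_0(d, g) for average degree d >= 2 and girth g >= 3.

    Assumed form (not restated in this section):
      g = 2r + 1:  1 + d * sum_{i=0}^{r-1} (d - 1)^i
      g = 2r:      2 * sum_{i=0}^{r-1} (d - 1)^i
    """
    if d < 2 or g < 3:
        raise ValueError("requires d >= 2 and g >= 3")
    r = g // 2
    geometric = sum((d - 1) ** i for i in range(r))
    if g % 2 == 1:
        return 1 + d * geometric
    return 2 * geometric


def gldpc_expansion_set_size(gamma: int, t: int, half_girth: int) -> float:
    """Evaluate n_0(gamma * t / (t + 1), g') for a Tanner graph of girth 2 g'."""
    return moore_bound(gamma * t / (t + 1), half_girth)


if __name__ == "__main__":
    # Column weight 8 with a 3-error-correcting sub-code, Tanner graph girth 12.
    print(gldpc_expansion_set_size(8, 3, 6))
```

With $\gamma = 4$ and $t = 1$ the degree argument is exactly 2, the boundary case of the condition above, and the bound reduces to $g'$ itself.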

Corollary 2: Let $\mathcal{C}(G, S)$ be a GLDPC code with a $\gamma$-left regular Tanner graph $G$ and $g(G) = 2g'$.
Assume that the sub-code $S$ has minimum distance at least $d_{\min} = 2t + 1$ and is capable of correcting $t$
errors. Then the parallel bit flipping algorithm can correct any error pattern of weight less than
$n_0(\gamma t/(t + 1), g')$.

C. Trapping Sets of GLDPC Codes 

We now exhibit a trapping set for the parallel bit flipping algorithm. By examining the expansion of
the trapping set, we show that the bound given in Theorem 7 cannot be improved when $\gamma$ is even.

Theorem 9: Let $\mathcal{C}$ be a GLDPC code with $\gamma$-left regular Tanner graph $G$. Let $\mathcal{T}$ be a set
consisting of the variable nodes $V$, with induced subgraph $\mathcal{I}$ having the following properties: (a) the
degree of each check in $\mathcal{I}$ is either $1$ or $t + 1$; (b) each variable node in $V$ is connected to
$\lceil \gamma/2 \rceil$ checks of degree $t + 1$ and $\lfloor \gamma/2 \rfloor$ checks of degree $1$; and (c) no
$\lfloor \gamma/2 \rfloor + 1$ checks of degree $t + 1$ share a variable node outside $\mathcal{I}$.
Then $\mathcal{T}$ is a trapping set.

Proof: Observe that all the checks of degree $t + 1$ in $\mathcal{I}$ are confused. Further, each confused check
does not send flip messages to variable nodes in $V$. Since any variable node in $V$ is connected to
$\lceil \gamma/2 \rceil$ confused checks, it remains corrupt. Also, no variable node outside $\mathcal{I}$ can receive
more than $\lfloor \gamma/2 \rfloor$ flip messages. Hence, no variable node which is originally correct can get
corrupted. By definition, $\mathcal{T}$ is a trapping set.

It can be seen that the total number of checks in $\mathcal{I}$ is equal to
$|V| \left( \lfloor \gamma/2 \rfloor + \lceil \gamma/2 \rceil / (t + 1) \right)$. Hence, the
set of variable nodes $V$ expands by a factor of $\gamma(t + 2)/(2(t + 1))$ when $\gamma$ is even. Hence, the bound
given in Theorem 7 cannot be improved in this case. ■
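The check count in the proof above is easy to evaluate for concrete parameters. The sketch below is an added illustration with hypothetical function names. It computes the expansion factor of the trapping set, $\lfloor \gamma/2 \rfloor + \lceil \gamma/2 \rceil/(t + 1)$ checks per variable node, and compares it with the Theorem 7 threshold $\gamma(t + 2)/(2(t + 1))$.

```python
from fractions import Fraction


def trapping_set_expansion(gamma: int, t: int) -> Fraction:
    """Checks per variable node in the induced subgraph described in Theorem 9.

    Each variable node has floor(gamma/2) private degree-1 checks and
    ceil(gamma/2) checks of degree t + 1, each shared by t + 1 variable nodes.
    """
    floor_half = gamma // 2
    ceil_half = gamma - floor_half
    return floor_half + Fraction(ceil_half, t + 1)


def theorem7_threshold(gamma: int, t: int) -> Fraction:
    """Expansion factor gamma (t + 2) / (2 (t + 1)) appearing in Theorem 7."""
    return Fraction(gamma * (t + 2), 2 * (t + 1))


if __name__ == "__main__":
    for gamma, t in [(4, 1), (4, 2), (5, 2), (6, 3)]:
        print(gamma, t, trapping_set_expansion(gamma, t), theorem7_threshold(gamma, t))
```

For even $\gamma$ the two quantities coincide, which is the tightness claim. For odd $\gamma$ the trapping set expands by strictly less than the threshold (by $t/(2(t + 1))$ under this count), so the construction does not show tightness in that case.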

For a set of variable nodes to be a trapping set, it is necessary that every variable node in the set is
connected to at least $\lceil \gamma/2 \rceil$ confused checks. This observation leads to the following bound on
the size of trapping sets.

Theorem 10: Let $\mathcal{C}$ be a GLDPC code with $\gamma$-left regular Tanner graph $G$ and $g(G) = 2g'$. Let
$n_c(d_l, d_r, 2g')$ denote the number of left vertices in a $(d_l, d_r)$ regular bipartite graph of girth $2g'$. Then the
size of the smallest possible trapping set of $\mathcal{C}$ is $n_c(\lceil \gamma/2 \rceil, t + 1, 2g')$.

Proof: Follows from Theorem 5 and Theorem 9. ■

Corollary 3: Let $\mathcal{C}(G, S)$ be a GLDPC code with a $\gamma$-left regular Tanner graph $G$ and $g(G) = 2g'$.
Assume that the sub-code $S$ has minimum distance at least $d_{\min} = 2t + 1$ and is capable of correcting $t$
errors. Then the parallel bit flipping algorithm cannot be guaranteed to correct all error patterns of weight
greater than or equal to $n_c(\lceil \gamma/2 \rceil, t + 1, 2g')$.
VI. Concluding Remarks

We derived lower bounds on the guaranteed error correction capability of LDPC and GLDPC codes
by finding bounds on the number of nodes that have the required expansion. The bounds depend on two
important code parameters, namely column weight and girth. Since the relations between rate, column
weight, girth and code length are well explored in the literature (see [1], [5] for example), bounds on
the code length needed to achieve a certain error correction capability can be derived for different column
weights and sub-codes (for GLDPC codes). The bounds presented in the paper serve as guidelines in
choosing code parameters in practical scenarios.

The lower bounds derived in this paper are weak. However, extremal graphs avoiding three, four and
five cycles have been studied in great detail (see [27], [28]) and these results can be used to derive tighter
bounds when the girth is eight, ten or twelve. Also, since an expansion factor of $3\gamma/4$ is not necessary
(see [7, Theorem 24]) for LDPC codes, it is possible that tighter lower bounds can be derived for some
cases. The results can be extended to message passing algorithms as well. There is a considerable gap
between the lower bounds and upper bounds on the error correction capability. Deriving lower bounds
based on the sizes of trapping sets rather than expansion may help bridge this gap.

Our approach can be used to derive bounds on the guaranteed erasure recovery capability for iterative
decoding on the BEC by finding the number of variable nodes which expand by a factor of $\gamma/2$. In [4],
the bounds on the guaranteed erasure recovery capability were derived based on the size of the smallest
stopping set. Both approaches give the same bounds, which also coincide with the bounds given by Tanner
[5] for the minimum distance. Results similar to the ones reported by Miladinovic and Fossorier [15] based
on the size of generalized stopping sets can also be derived.